PROGRAMMATIC DESIGN SPACE EXPLORATION THROUGH VALIDITY
FILTERING AND QUALITY FILTERING 

This patent application is related to the following co-pending U.S. 
Patent applications, commonly assigned and filed on August 20, 1999: 
U.S. Patent Application No. 09/378,596, entitled AUTOMATIC DESIGN 
OF PROCESSOR DATAPATHS, by Shail Aditya Gupta and Bantwal 
Ramakrishna Rau; 

U.S. Patent Application No. 09/378,293, entitled AUTOMATIC DESIGN 
OF VLIW INSTRUCTION FORMATS, by Shail Aditya Gupta, Bantwal 
Ramakrishna Rau, Richard Craig Johnson, and Michael S. Schlansker; 
U.S. Patent Application No. 09/378,601, entitled PROGRAMMATIC 
SYNTHESIS OF A MACHINE DESCRIPTION FOR RETARGETING A 
COMPILER, by Shail Aditya Gupta; 

U.S. Patent Application No. 09/378,395, entitled AUTOMATIC DESIGN 
OF VLIW PROCESSORS, by Shail Aditya Gupta, Bantwal Ramakrishna 
Rau, and Vinod Kumar Kathail; 

U.S. Patent Application No. 09/378,298, entitled PROGRAMMATIC 
SYNTHESIS OF PROCESSOR ELEMENT ARRAYS, by Robert Schreiber, 
Shail Aditya Gupta, Vinod Kumar Kathail, Sadun Anik, and Bantwal 
Ramakrishna Rau; 

U.S. Patent Application No. 09/378,397, entitled PROGRAMMATIC 
METHOD FOR REDUCING COST OF CONTROL IN PARALLEL 
PROCESSES, by Alain Darte and Robert Schreiber; 
U.S. Patent Application No. 09/378,431, entitled FUNCTION UNIT 
ALLOCATION IN PROCESSOR DESIGN, by Robert Schreiber; 
U.S. Patent Application No. 09/378,295, entitled INTERCONNECT 
MINIMIZATION IN PROCESSOR DESIGN, by Robert Schreiber; 
U.S. Patent Application No. 09/378,394, entitled AUTOMATED DESIGN
OF PROCESSOR INSTRUCTION UNITS, by Shail Aditya Gupta and
Bantwal Ramakrishna Rau;

U.S. Patent Application No. 09/378,393, entitled PROGRAMMATIC 
ITERATION SCHEDULING FOR PARALLEL PROCESSORS, by Robert S. 
Schreiber, Bantwal Ramakrishna Rau, and Alain Darte; and 
U.S. Patent Application No. 09/378,290, entitled AUTOMATED DESIGN 
OF PROCESSOR SYSTEMS USING FEEDBACK FROM INTERNAL 
MEASUREMENTS OF CANDIDATE SYSTEMS, by Mike Schlansker, 
Vinod Kathail, Greg Snider, Shail Aditya Gupta, Scott A. Mahlke, and 
Santosh G. Abraham. 

The above patent applications are hereby incorporated by reference. 

Technical Field 

The invention pertains to programmatic methods for the 
preparation of sets of valid, superior system designs for processor 
systems, components of processor systems, and other systems 
characterized by discrete parameters. 

Background of the Invention 

Embedded computer systems are used in a wide range of 
electronic devices and other equipment, including mobile phones, 
printers, and cars. These devices are not usually regarded as computer 
systems, but they nevertheless rely heavily on embedded computer 
systems to provide key functions, functionality, and features. In many 
cases, the required computing capabilities of such embedded systems 
match or exceed the capabilities required of general-purpose computers. 
Furthermore, embedded systems must often meet severe cost and power 
dissipation requirements. The number of embedded computers far 
exceeds the number of more general-purpose computer systems such as 
PCs or servers and the total value of these embedded computers will
eventually exceed that of general-purpose computer systems. 

The design process for embedded computers differs from that of 
general-purpose computer systems. The embedded computer systems 
have greater design freedom than general-purpose computers because 
there is little need to adhere to existing standards to run existing 
software. In addition, since embedded computers are used in specific 
settings, they can be custom-tuned to a greater degree than a general 
purpose computer. On the other hand, total sales of a particular 
embedded computer system are typically insufficient to support a full 
custom design. Therefore, although there is a greater freedom to 
customize and the benefits of customization are large, the available 
design budget is limited. Therefore, automated design tools are needed 
to capture the benefits of customization while maintaining a low design 
cost. 

The specification of an embedded computer system includes 
specifications of design parameters for several subsystems. For 
example, a cache memory can include a unified cache or a split-cache, 
and these caches can be specified in terms of a cache size, associativity, 
line size, and number of ports. For example, a cache memory design can
be specified as an 8 kB 2-way set associative cache with a line size of
32 bytes. The evaluation of cache designs is time-consuming because
of the complexity of processor and cache simulation. In addition, the
size of the embedded processor design space increases combinatorially
with the number of design parameters. As a result, an exhaustive
exploration of a typical embedded processor design space is infeasible,
and improved methods for evaluating designs are needed.
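
As a rough illustration of this combinatorial growth, the following Python sketch counts the designs produced by a small set of cache parameters. The parameter names and value ranges are hypothetical and are chosen only for illustration; they are not taken from any particular design.

```python
from itertools import product

# Hypothetical parameter ranges for one cache; the values are illustrative only.
CACHE_PARAMS = {
    "size_kb": [1, 2, 4, 8, 16, 32, 64],
    "associativity": [1, 2, 4, 8],
    "line_bytes": [16, 32, 64],
    "ports": [1, 2],
}

def enumerate_designs(params):
    """Yield every combination of parameter values as a dict."""
    names = list(params)
    for values in product(*(params[n] for n in names)):
        yield dict(zip(names, values))

def design_count(params):
    """Number of combinations: the product of the sizes of the ranges."""
    count = 1
    for values in params.values():
        count *= len(values)
    return count

one_cache = design_count(CACHE_PARAMS)   # 7 * 4 * 3 * 2 combinations
three_caches = one_cache ** 3            # three independently chosen caches
```

Under these assumed ranges a single cache already has 168 candidate designs, and a memory hierarchy with three such caches has 168 cubed candidates, each of which would require a simulation to evaluate.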

Many other complex systems encounter similar problems. 
Evaluation of system designs can be slow and expensive, or determining 
whether a particular combination of design parameters yields a valid 
design can be difficult. Accordingly, improved methods for identifying
valid system designs and determining how well various designs satisfy 
evaluation criteria are needed. 

Summary of the Invention 

Programmatic methods for obtaining validity sets and quality sets 
of system designs from a design space of designs are provided. For a 
hierarchical system, component validity filters produce component 
validity sets. A system validity set is obtained that is a Cartesian 
product of the component validity sets. In a specific embodiment, 
component designs are specified by component parameters, and the 
component validity filters are independent of component parameters of 
other components, and a system validity filter is applied to the Cartesian 
product of the component validity sets. 

In another specific embodiment, component validity sets for each 
of the component designs are obtained by applying component validity 
filters that are defined by corresponding component validity predicates. 
Component evaluation functions and component quality filters are 
applied to the component validity sets to form component quality sets. 
A set of system designs is then produced that corresponds to a
Cartesian product of the component quality sets. In one example 
embodiment, a system evaluation function and a system quality filter are 
applied to the set of system designs thus obtained. 

In a further specific embodiment, system designs are 
programmatically selected by selecting and applying a system validity 
filter to the system designs. The system validity filter is defined by a 
system validity predicate and a set of selected system designs is 
produced containing only system designs that satisfy the system validity 
predicate. In a further embodiment, the system validity predicate is a 
product of partial validity predicates that are mutually exclusive. 

In a method of programmatically selecting a set of selected
system designs, a system validity filter is selected that is defined by a 
system validity predicate. The system validity predicate includes one or 
more partial validity predicates that define partial validity filters. The 
partial validity filters are applied to the system designs to form partial 
validity sets that include system designs satisfying respective partial 
validity filters. An evaluation function is applied to the system designs 
of the partial validity sets to produce an evaluation metric for each 
system design. A quality filter produces respective partial quality sets 
that are combined to produce a first quality set. In a specific 
embodiment, the partial validity predicates are mutually exclusive and 
the system validity predicate is a product of the partial validity 
predicates. In a further specific embodiment, the quality filter is applied 
to the first quality set to produce a second quality set. 

A method of programmatically selecting a design for a cache 
memory is also disclosed. Components for the cache memory are 
selected and component Pareto sets are prepared. A combined Pareto 
set is prepared from the component Pareto sets, and a cache memory 
design is selected from the combined Pareto set. 

Further features of the invention will become apparent from the 
following detailed description and accompanying drawings. 

Brief Description of the Drawings

FIG. 1 illustrates a computer program that produces a validity set
from a design space.

FIG. 2 illustrates a computer program that includes validity filters
and quality filters.

FIG. 3 illustrates a computer program that uses mutually exclusive
validity predicates to produce two validity sets.

FIG. 4 illustrates a computer program that includes component
validity filters that are applied to form component validity sets that are
combined to produce a system validity set.

FIG. 5 illustrates a computer program that includes component
validity filters that produce component validity sets that are combined to
form a first system validity set to which a system validity filter is applied
to produce a second system validity set.

FIG. 6 illustrates a computer program that performs validity and
quality filtering on component design spaces and produces a set of
system designs that is then filtered by a system quality filter.

FIG. 7 illustrates a computer program that performs validity
filtering on component design spaces to produce component validity
sets, combines the component validity sets to produce a system validity
set, and then applies system validity and quality filters.

FIG. 8 illustrates a computer program that produces a validity set
from component design spaces.

FIG. 9 illustrates a computer program that produces a quality set
from component design spaces.

FIG. 10 illustrates a computer program that produces a quality set
from component design spaces.

FIG. 11 shows a mapping of designs into a time/cost plane.

FIG. 12 is a block diagram of a processor system that includes a
cache memory, a VLIW processor, and a systolic array.

FIG. 13 contains a Pareto curve for the instruction cache of FIG. 12.

FIG. 14 contains a Pareto curve for the data cache of FIG. 12.

FIG. 15 contains a Pareto curve for the unified cache of FIG. 12.

FIG. 16 contains a Pareto curve for the cache memory of FIG. 12.

FIG. 17 contains Pareto curves illustrating the programmatic
selection of a design for the processor system of FIG. 12.

FIG. 18 contains a Pareto curve for the VLIW processor of FIG. 12.

FIG. 19 contains a Pareto curve for the processor system of FIG. 12.



Definitions

For convenience, the following list of definitions of terms used 
herein is provided: 
Design Space 

A design space is a set of designs for a system. 
Discrete Design Parameter

A discrete design parameter is a parameter that at least partially 
specifies a portion of a design and that assumes a discrete set of values, 
for example, Boolean values, integer values, sets, graphs, etc. As used 
herein, a system is specified by discrete parameters. 
Programmatic

The term "programmatic" means performed by a program 
implemented in either software or hardware. The methods described 
below are implemented in programs stored on a computer readable 
medium. A computer readable medium is a generic term for memory 
devices commonly used to store program instructions and data in a
computer and for memory devices used to distribute programs (e.g., a
CD-ROM). 

Component 

A component is a part of a system. A system can comprise one 
or more components. 

Component Design 

A component design is a design for a component of a system. A 
component might, itself, be a system that has components. 
Composition 

A composition is a construction of a system design from 
component designs. 

Hierarchical Design Space 

A design space in which each design includes a set of component 
designs and in which each of the component designs can be a system 
design. 

Term 

A Boolean-valued relation (e.g., greater than, less than, equal) 
between two expressions involving discrete parameters characterizing a 
design. 

Singleton Term 

A term involving only parameters corresponding to a single 
component. 

Coupled Term 

A term involving parameters corresponding to multiple 
components. 

Common Term 

A logical term in a system validity function V(), expressed in 
canonical form, that occurs in all AND expressions of the system validity 
function V() and includes only singleton terms. Component parameters 
appearing only in common terms are referred to as common parameters. 



HP1 0990408-1 9 

Express Mail No. EM295378042US 

Partial Term 

A term in a system validity function V() that is not a common term.

Validity Predicate

A Boolean function constructed from Boolean terms. A design is a 
valid design if and only if a corresponding validity predicate evaluates to 
TRUE for the parameters of that design. 
Validity Filter

A function, defined by a validity predicate, whose input and
output are both sets of designs. The output set only contains those
designs in the input set for which the validity predicate is TRUE. Also, a
function that identifies a design as satisfying a validity predicate.

Product Form Predicate

A predicate which is the conjunction of multiple Boolean
expressions, wherein each Boolean expression contains terms that
involve the parameters of only one component.

Validity Set

A set of designs obtained by application of a validity filter.

Evaluation Metric

The vector of metrics defining the quality (e.g., performance, cost, 
size, etc.) of a design. 

System Evaluation Metric 
An evaluation metric for a system design. 
Component Evaluation Metric

An evaluation metric for a component design. 
Evaluation Function 

A formula or procedure for computing a vector-valued evaluation 
metric for a given design. An evaluation function can consist of, for 
example, the evaluation of a formula or the execution of a computer
program, or simulation of the execution of a computer program. 
System Evaluation Function 

An evaluation function that is applied to system designs. 
Component Evaluation Function 

An evaluation function that is applied to component designs. 
Comparison Function 

A function that compares evaluation metrics for two or more 
designs. A comparison function that compares designs A and B 
generally returns one of four answers: (1) A is better than B; (2) B is 
better than A; (3) A and B are equally good; (4) neither A nor B can be 
said to be better than the other.

Correlated Evaluation Function 

A component evaluation function is correlated with a system
evaluation function if the following holds most of the time and, when it
does not hold, the deviation is generally small: if the component
evaluation function indicates that a component B is worse than a
component A of the same type, then the system evaluation function will
indicate that any system containing B is worse than the same system
but with B replaced by A.

Monotonicity 

A monotonically non-decreasing function is defined as a function 
whose value does not decrease for any increase in the value of its 
arguments. A monotonic decomposition is a system decomposition into
components wherein a system quality function is a monotonically
non-decreasing function of component parameters.

Pareto Set 

A set of all designs such that there is no other design in the 
design space better than any one of them. 
Quality Set 
A Pareto set or some acceptable approximation to a Pareto set.
Quality Design 

A design that is an element of a quality set. 
Quality Filter 

A function that computes a quality set from a set of designs, or
identifies a design as a quality design.

Abstract Instruction Set Architecture Specification

An Abstract Instruction Set Architecture (ISA) Specification is an
abstract specification of a processor design and may include the
following:

an opcode repertoire, possibly structured as operation sets;
a specification of the I/O format for each opcode;
a register file specification, including register file types and the
number of each type;
a specification of the desired instruction level parallelism (ILP)
constraints, making use of some form of concurrency sets, exclusion
sets or a combination of concurrency and exclusion sets, that specifies
which sets of operation groups/opcodes can be issued concurrently; and
other optional architecture parameters, e.g., presence/absence of
predication, speculation, etc.
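
The comparison function and Pareto set defined above can be illustrated with a short Python sketch. The sketch assumes that every component of an evaluation metric is to be minimized (for example, time and cost); the function names and the string return values are hypothetical and serve only to make the four possible answers of a comparison function concrete.

```python
def compare(a, b):
    """Compare two metric vectors; smaller is assumed better in every dimension.

    Returns "A" if a is better, "B" if b is better, "equal" if they are
    equally good, and "incomparable" if neither can be said to be better.
    """
    a_not_worse = all(x <= y for x, y in zip(a, b))
    b_not_worse = all(y <= x for x, y in zip(a, b))
    if a_not_worse and b_not_worse:
        return "equal"
    if a_not_worse:
        return "A"
    if b_not_worse:
        return "B"
    return "incomparable"

def pareto_set(metrics):
    """Return the metric vectors for which no other vector is better."""
    return [m for m in metrics
            if not any(compare(other, m) == "A" for other in metrics)]
```

A quality filter in the sense defined above could apply pareto_set directly, or could apply an approximation of it when an exact Pareto set is too expensive to compute.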



Detailed Description 

The identification of superior designs for a complex system having
a large design space can be time-consuming and expensive. The designs
of many systems of practical interest are characterized by one or more
(typically very many) discrete design parameters. Examples of such
systems include computer systems and other digital electronic systems.
A typical discrete parameter for such systems is memory size because
memory contains integer numbers of bits and is frequently restricted to
numbers of bits or bytes that are powers of two.

Quality filtering is described below with reference to processor
systems such as very long instruction word (VLIW) processor systems
and other processor systems as a specific illustrative example. The
design of processor systems involves choosing designs for numerous
subsystems of the processor system. Because there are many design
variables and the evaluation of even a single design can be expensive
and time consuming, exploring all possible designs is generally infeasible.
Accordingly, validity and quality filtering can reduce the time and money
spent on system design. In addition, programmatic quality filtering can
replace design selection based on designer "hunches" that do not
necessarily discover superior designs. In some cases, VLIW processor
design is simplified by decomposing the processor system into
subsystems, referred to herein as "components." Designs for the
components are validity and quality filtered.

Processor system designs can include a processor, a cache
memory, and a systolic array. In some applications, the processor is a
VLIW processor that is specified by an abstract ISA specification that
includes a data set that contains specifications for predication,
speculation, numbers and types of registers, numbers and types of
functional units, and literal widths for memory literals, branch literals,
and integer data literals. In the examples discussed below in which
execution time is selected as a performance criterion, sufficient
processor data is provided to permit the simulated execution of an
application program on a selected processor design. Cache memory can
include a level 1 data cache, a level 1 instruction cache, and a level 2
unified cache. Each of these caches can be specified with parameters
for the number of ports, cache size, line size, and associativity. A
systolic array can be specified by shape, bandwidth, and mapping
direction.

For convenience, a design space (D) is defined as a set of designs
(d) for an embedded processor system, a very long instruction word
(VLIW) processor system, a cache memory, or other system of interest.
The design space D can be limited by design constraints, such as a total
substrate area available for a processor or other components, total
available power, or other design constraint. Superior designs in the
design space D are to be identified and a particular design selected for
implementation. Generally a design d of the design space D is evaluated
in terms of appropriate performance criteria. For processor systems
including embedded processor systems, VLIW processor systems, and
components thereof (such as cache memory), two primary performance
criteria are cost of the design and execution time of the design. Cost
can be measured as an actual manufacturing cost but is conveniently
represented as a substrate area required to implement the design. The
execution time is a time required for a component of the system of
interest to complete a task associated with that component. For
example, the execution time associated with a cache memory is the
additional execution time required due to the selected cache design.
The execution time is determined by calculating, measuring, or
estimating the time required to execute a benchmark application using
benchmark data. The selected benchmark application usually is chosen
to impose demands on the processor system or components similar to
the demands imposed by the intended applications of the processor
system.

For the set of designs d of the design space D, the system
designer uses an evaluation function E(d) to assess each of the designs d
in terms of the chosen performance criteria. In general, if designs are
evaluated according to m performance criteria, the evaluation function
E(d) maps the designs to an evaluation metric in an m-dimensional space,
wherein the m dimensions correspond to the performance criteria. For
evaluation of processor designs in which cost and execution time are the
selected performance criteria, the evaluation function E(d) maps a design
d to a 2-dimensional time/cost space.
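
A minimal Python sketch of an evaluation function that maps a design to a point in a 2-dimensional time/cost space follows. The CacheDesign type and both formulas are hypothetical placeholders chosen only so that a larger cache costs more area and stalls less; an actual evaluation function would calculate, measure, or estimate these quantities, for example by simulating a benchmark application.

```python
from typing import NamedTuple, Tuple

class CacheDesign(NamedTuple):
    """Hypothetical cache design described by three discrete parameters."""
    size_kb: int
    associativity: int
    line_bytes: int

def evaluate(d: CacheDesign) -> Tuple[float, float]:
    """Toy E(d): map a design to a (time, cost) point.

    Both formulas are placeholders, not models of real hardware.
    """
    cost = d.size_kb * (1.0 + 0.1 * d.associativity)      # stand-in for substrate area
    time = 100.0 / (d.size_kb * d.associativity) ** 0.5   # stand-in for added stall time
    return (time, cost)

small = evaluate(CacheDesign(size_kb=8, associativity=2, line_bytes=32))
large = evaluate(CacheDesign(size_kb=16, associativity=2, line_bytes=32))
```

With these placeholder formulas the larger cache has a lower time and a higher cost than the smaller one, so neither design is better than the other in both dimensions.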

FIG. 1 illustrates a computer program 100 that carries out a
programmatic method for selecting a set of potentially valid system
designs (a validity set) of a design space D. The design space D is
represented as a database listing all possible (valid and invalid) system
designs, or as a database listing system design parameters p_1, p_2, ...
and respective parameter ranges r_1, r_2, ..., or a combination thereof.
The design space D generally includes some invalid system designs
because arbitrary combinations of valid parameter values (i.e., values in
the ranges r_1, r_2, ...) can produce system designs that are invalid.

A design input module 103 of the program selects a set of system
designs from the design space D by retrieving the set of system designs
from the database D or by composing the set by selecting values for the
parameters p_1, p_2, ... from the database D. The design input
module 103 delivers the set of designs or a selected design to
validity filters V_1, ..., V_n that check the system designs for validity
based on respective validity predicates v_1, ..., v_n. The validity
predicates are generally determined manually by a system designer, but
can be produced programmatically as well. If a selected system design
satisfies an arbitrary validity predicate v_i, the validity filter V_i adds the
selected design to a validity set S_i, and the sets S_1, ..., S_n are combined
in a validity set S that is a union of the sets S_1, ..., S_n. (As shown in
FIG. 1 and elsewhere herein, "U" denotes a union operator.) A selected
design can satisfy one or more or none of the validity filters V_1, ..., V_n.
The validity filters V_1, ..., V_n check design validity until all designs from
the design space D have been checked. The validity set S then contains
all system designs from the design space D that satisfy one or more of
the validity predicates v_1, ..., v_n.
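
The validity filtering of FIG. 1 can be sketched in Python as follows. This is an illustrative sketch only: the dictionary representation of a design, the two example predicates, and the function names are assumptions and do not appear in FIG. 1.

```python
def validity_filter(predicate, designs):
    """Return the designs for which the validity predicate is true."""
    return [d for d in designs if predicate(d)]

def validity_set(predicates, designs):
    """Union of the per-predicate validity sets S_1, ..., S_n, without duplicates."""
    result = []
    for predicate in predicates:
        for d in validity_filter(predicate, designs):
            if d not in result:
                result.append(d)
    return result

# Hypothetical design space composed from two parameters and their ranges.
designs = [{"n_p": p, "n_m": m} for p in (1, 2, 4) for m in (1, 2, 4)]

# Hypothetical validity predicates v_1 and v_2.
v_1 = lambda d: d["n_p"] == d["n_m"]
v_2 = lambda d: d["n_p"] < d["n_m"] and d["n_m"] <= 2

S = validity_set([v_1, v_2], designs)
```

Of the nine designs composed here, three satisfy v_1 and one satisfies v_2, so S contains four designs and the remaining five are dropped from further analysis.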

Filtering the design space D can reduce the effort required to
select a suitable system design. For example, if the design space D
includes 10,000 system designs and there are two validity filters V_1, V_2
that each transmit 1,000 designs to the validity set S, at least 8,000
invalid system designs are eliminated from further analysis.

FIG. 2 illustrates a computer program 200 that produces a filtered
set of system designs that is both validity filtered and quality filtered.
System designs that satisfy one or more of the validity predicates
v_1, ..., v_n are delivered to respective evaluation modules E (or a single
evaluation module) that produce an evaluation metric for each system
design based on a common evaluation function. The evaluation metrics are
provided to a quality filter Q along with the selected design. The quality
filter Q selects system designs satisfying one or more quality criteria (or
quality predicates), and these selected designs are added to a quality set
S'. Representative quality criteria are, for computer systems, the wafer
area required to define associated memory and processing units, and the
execution time required to execute a typical application program for
which the computer system is intended. Many other quality criteria are
possible. For some systems, the evaluation metric includes both wafer area
and execution time and the quality filter adds only Pareto designs to the
quality set. Pareto designs are discussed in detail below.

Referring further to FIG. 2, the quality filter Q selects system
designs from the quality set S' and produces the set S that also is a
quality set. The system designs of the set S all satisfy at least one of
the validity predicates v_1, ..., v_n, and the quality filter Q compares the
evaluation metrics of the valid system designs corresponding to the
various validity predicates v_1, ..., v_n. Some designs are removed by this
second quality filtering because designs obtained by satisfying the
various validity predicates v_1, ..., v_n can eclipse each other.
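
The two quality filtering passes of FIG. 2 can be sketched in Python as follows. The sketch assumes a Pareto-style quality filter in which every component of the evaluation metric is minimized; the function names, the example (time, cost) points, and the two predicates are hypothetical.

```python
def dominates(a, b):
    """True if metric a is no worse than b everywhere and better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def quality_filter(evaluated):
    """Keep the (design, metric) pairs whose metric no other pair dominates."""
    return [(d, m) for d, m in evaluated
            if not any(dominates(m2, m) for _, m2 in evaluated)]

def two_stage_filter(designs, predicates, evaluate):
    """Quality filter each validity set, pool the survivors, then filter again."""
    pooled = []
    for predicate in predicates:
        evaluated = [(d, evaluate(d)) for d in designs if predicate(d)]
        for pair in quality_filter(evaluated):
            if pair not in pooled:
                pooled.append(pair)
    return quality_filter(pooled)

# Hypothetical designs that are already (time, cost) points, so E(d) = d.
designs = [(1, 9), (2, 5), (3, 6), (5, 2), (6, 3)]
predicates = [lambda d: d[0] <= 3, lambda d: d[0] >= 3]
survivors = [d for d, _ in two_stage_filter(designs, predicates, lambda d: d)]
```

In this example the design (3, 6) survives the first pass for the second predicate, because nothing else satisfying that predicate is better, but it is eclipsed in the second pass by (2, 5), which was admitted by the first predicate.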

Generally, some of the designs in the quality sets can be invalid.
However, in many cases, a system validity predicate can be represented
as a sum (a logical OR) of the validity predicates v_1, ..., v_n, and, in such
cases, all designs of the quality sets are valid. In addition, the system
validity predicate V() and the validity predicates v_1, ..., v_n can be
configured so that a system design that is determined to be valid by the
validity filters V_1, ..., V_n is evaluated and added to the quality set S'
only once. Such an arrangement of validity filters is discussed below in
terms of a specific example.

For a system that includes a processor and a memory, an example
validity function V() is:

V() = ((instrSize <= 64) & (n_p <= n_m) & (intLitSize <= 32)) ||
      ((instrSize <= 64) & (n_p = n_m) & (memLitSize <= 32)),

wherein instrSize is an instruction length, n_p is a number of processor
ports, n_m is a number of memory ports, intLitSize is a length of an
integer literal, memLitSize is a length of a memory literal, "&" denotes
a logical AND operation, and "||" denotes a logical OR operation. The
validity function V() can be decomposed into three mutually exclusive
logical terms as follows. (Mutually exclusive logical terms are defined as
logical terms at most one of which can be true for arbitrary values of
parameters of the terms.) The decomposition of the validity function V()
uses the fact that a logical expression of the form C = A OR B can be
represented as the disjunction (logical OR) of three mutually exclusive
AND terms A AND B, A AND (NOT B), and (NOT A) AND B, such that
C = (A AND B) OR (A AND (NOT B)) OR ((NOT A) AND B). Accordingly, the
validity function V() can be expressed as:

V() = v_1 OR v_2 OR v_3, wherein

v_1 = ((instrSize <= 64) & (n_p <= n_m) & (intLitSize <= 32)) &
      (instrSize <= 64) & (n_p = n_m) & (memLitSize <= 32), which
simplifies to

v_1 = (instrSize <= 64) & (n_p = n_m) & (memLitSize <= 32) &
      (intLitSize <= 32);

v_2 = ((instrSize > 64) OR (n_p > n_m) OR (intLitSize > 32)) &
      (instrSize <= 64) & (n_p = n_m) & (memLitSize <= 32), which
simplifies to

v_2 = (instrSize <= 64) & (n_p = n_m) & (memLitSize <= 32) &
      (intLitSize > 32); and

v_3 = ((instrSize > 64) OR (n_p <> n_m) OR (memLitSize > 32)) &
      (instrSize <= 64) & (n_p <= n_m) & (intLitSize <= 32), which
simplifies to

v_3 = (instrSize <= 64) & (intLitSize <= 32) & ((n_p < n_m) ||
      ((n_p <= n_m) & (memLitSize > 32))).
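
The decomposition above can be checked by brute force. The following Python sketch encodes V() and the simplified terms v_1, v_2, and v_3 and confirms, over a small grid of sample values that straddle each threshold, that at most one term is true for any design and that some term is true exactly when V() is true. The sample grid and the function names are illustrative assumptions.

```python
from itertools import product

def A(instr, n_p, n_m, int_lit, mem_lit):
    """First AND expression of V()."""
    return instr <= 64 and n_p <= n_m and int_lit <= 32

def B(instr, n_p, n_m, int_lit, mem_lit):
    """Second AND expression of V()."""
    return instr <= 64 and n_p == n_m and mem_lit <= 32

def V(*d):
    return A(*d) or B(*d)

def v_1(instr, n_p, n_m, int_lit, mem_lit):      # A AND B
    return instr <= 64 and n_p == n_m and mem_lit <= 32 and int_lit <= 32

def v_2(instr, n_p, n_m, int_lit, mem_lit):      # (NOT A) AND B
    return instr <= 64 and n_p == n_m and mem_lit <= 32 and int_lit > 32

def v_3(instr, n_p, n_m, int_lit, mem_lit):      # A AND (NOT B)
    return (instr <= 64 and int_lit <= 32
            and (n_p < n_m or (n_p <= n_m and mem_lit > 32)))

# Sample values on both sides of every threshold (illustrative grid only).
SAMPLES = list(product((32, 64, 128), (1, 2), (1, 2), (16, 32, 64), (16, 32, 64)))

def check_decomposition(samples):
    """True if the terms are mutually exclusive and their OR equals V()."""
    for d in samples:
        hits = [v_1(*d), v_2(*d), v_3(*d)].count(True)
        if hits > 1 or (hits == 1) != V(*d):
            return False
    return True
```

A check of this kind does not prove the identity for all parameter values, but it exercises every combination of true and false terms in the predicates and is a quick guard against transcription errors in the simplified forms.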

FIG. 3 illustrates validity filtering using mutually exclusive validity
predicates v_1, v_2, v_3. As in the examples of FIGS. 1-2, a design input
module selects or prepares a system design or a set of system designs D
and provides the designs to validity filters V_1, V_2, V_3 that perform validity
filtering based on the mutually exclusive validity predicates v_1, v_2, v_3
such as those discussed above. With such validity predicates, a valid
system is identified as valid by only one of the validity filters V_1, V_2, V_3
and is added to a set of potentially valid designs only once. In addition,
the designs satisfying the mutually exclusive validity predicates v_1, v_2, v_3
can be added to validity sets S_1, S_2, wherein the validity sets S_1, S_2
correspond to the original (nonexclusive) validity predicates. In FIG. 3,
the validity filters V_1, V_2, V_3 can be followed by evaluation components
and quality filters prior to forming the validity sets S_1, S_2.

Many practical systems are hierarchical, and validity filtering and
quality filtering can be carried out on component design spaces instead
of, or in addition to, filtering the system design space directly. FIG. 4
illustrates a computer program that programmatically performs validity
filtering on a hierarchical system. The design space includes component
design spaces D_1, ..., D_n corresponding to the system components. A
component design input module provides component designs or a set of
component designs to respective component validity filters V_D1, ..., V_Dn.
The component validity filters V_D1, ..., V_Dn determine whether a
component design is valid based on respective component validity
predicates v_D1, ..., v_Dn. The component validity filters V_D1, ..., V_Dn
deliver component validity sets S_D1, ..., S_Dn to a system composition
module 403 that combines the component designs to form system
designs. The system composition module 403 forms all combinations of
the various component designs in the component validity sets, i.e.,
forms the Cartesian product of the component validity sets. These
system designs satisfy the component validity predicates v_D1, ..., v_Dn
but are not necessarily valid system designs. If the system has a validity
predicate V() that is a product (a logical AND) of the component validity
predicates v_D1, ..., v_Dn, then these system designs are all valid.
Otherwise, an additional system validity filter V_s can be provided, as
shown in FIG. 5.
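
The composition of FIGS. 4-5 can be sketched in Python with a Cartesian product. In this sketch the two component design spaces, the component validity predicates, and the coupled term used as the system validity filter V_s are all hypothetical examples.

```python
from itertools import product

def compose(component_validity_sets):
    """Cartesian product of component validity sets; each tuple is one system design."""
    return list(product(*component_validity_sets))

# Hypothetical component design spaces: processor port counts and memory port counts.
processor_space = [1, 2, 3, 4]
memory_space = [1, 2, 4, 8]

# Component validity filters use singleton terms only, so each
# component design space can be filtered independently of the other.
S_D1 = [n_p for n_p in processor_space if n_p <= 2]
S_D2 = [n_m for n_m in memory_space if n_m <= 4]

# First system validity set: every combination of valid component designs.
first_system_set = compose([S_D1, S_D2])

# System validity filter V_s with a coupled term relating both components.
second_system_set = [(n_p, n_m) for n_p, n_m in first_system_set if n_p <= n_m]
```

Filtering the components first means the coupled term is checked on 6 combinations rather than on all 16 combinations of the unfiltered component design spaces.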

FIG. 6 illustrates a computer program that performs programmatic
validity and quality filtering of component design spaces. A component
design input module (similar to that shown in FIG. 1) selects or
generates component designs or sets of designs for components
D_1, ..., D_n and delivers the designs to respective validity filters
V_D1, ..., V_Dn that deliver component validity sets to respective
evaluation modules E_D1, ..., E_Dn. The evaluation modules E_D1, ..., E_Dn
evaluate the component designs based on predetermined criteria
according to respective evaluation functions E_D1(), ..., E_Dn(), producing
component evaluation metrics. Component quality filters Q_D1, ..., Q_Dn
receive the component designs and associated component evaluation
metrics and implement component comparison functions. The
component designs, after selection by the component quality filters
Q_D1, ..., Q_Dn (i.e., after preparation of component quality sets), are
delivered to a composition module 603 that produces a set of system
designs that corresponds to a Cartesian product of the component quality
sets. These system designs are then communicated to a system evaluation
module E_s and a system quality filter Q_s that produce a validity filtered
quality set.

FIG. 7 illustrates a programmatic method of obtaining a set of
designs that is both validity filtered and quality filtered. Respective
validity filters V_D1, ..., V_Dn produce respective component validity
sets for the components D_1, ..., D_n. A system composer 703 forms a
Cartesian product of the component validity sets, producing a set of
system designs. The designs of this set are not necessarily valid, even
though the constituent component designs are valid. A system validity
filter V_s, a system evaluation function E_s, and a system quality
filter Q_s receive the set of system designs and produce a filtered set
of system designs.

In the examples of FIGS. 4-7, each of the component design
spaces D_1, ..., D_n is validity filtered, but such validity filtering
can be omitted if all designs from a design space are known to be valid.

FIGS. 8-9 illustrate computer programs that perform validity
filtering or quality filtering (or both) on system designs composed of
component designs D_1, ..., D_n. In FIG. 8, respective common
component validity filters C_1, ..., C_n prepare component validity
sets for respective component designs D_1, ..., D_n. The component
validity sets are then filtered by partial component validity filters
defined by partial component validity predicates
(V_11, ..., V_1z), ..., (V_n1, ..., V_nz), respectively. As noted
previously, for any component design space for which all designs are
known to be valid, validity filtering can be omitted, and if all system
designs are known to be valid, validity filtering can be completely
omitted. The resulting partial component validity sets are combined to
form component validity sets S_11, ..., S_nm. In steps
801_1, ..., 801_m, Cartesian products of these sets form system design
sets S_1, ..., S_n that are combined to form a system validity set S.

FIG. 9 illustrates a design selection program 901 that performs
validity filtering in a manner similar to that of FIG. 8, with
additional quality filtering on both system designs and component
designs. The design selection program 901 includes common component
validity filters C_1, ..., C_n for respective components D_1, ..., D_n.
The program 901 receives component designs, design specifications, or
sets of designs D_1, ..., D_n for system components based on a system
decomposition. Generally, the program 901 uses the component design
specifications D_1, ..., D_n to generate an exhaustive set of component
designs but can receive component designs previously generated. The
common component validity filters C_1, ..., C_n prepare component
validity sets and discard component designs determined to be invalid.

While the common component validity filters C_1, ..., C_n can
identify invalid component designs, not all combinations of component
designs from the common component validity sets result in valid system
designs, and the program 901 splits component design spaces into
disjoint predicated design spaces 911_1, ..., 911_n so that only valid
combinations of component designs are considered. A system composer
912 generates sets of system designs based on the valid component
designs and the valid combinations of component designs. In a final
combining step 913 these designs are combined to form a complete set
of system designs. A quality filter 917 then produces a quality set
(such as a comprehensive Pareto set) and associated evaluation metrics.

One or more of the common component validity filters C_1, ..., C_n
can include a Boolean system validity function V(). The system validity
function V() is conveniently expressed in a canonical OR-AND form to
comprise an OR of one or more AND expressions, each of the AND
expressions comprising an AND of one or more terms, wherein the terms
within an AND are the smallest terms in the validity function V() that
evaluate to Boolean values. Because any Boolean function can be reduced
to canonical OR-AND form, consideration of the system validity function
V() in a canonical form does not limit the generality of the system
validity function V(). As an example, a system having components that
include a processor and a memory can be specified by processor
parameters instrSize, intLitSize and memLitSize, corresponding to
instruction size, integer literal size, and memory literal size,
respectively. In addition, the processor has a number n_p of data
access ports and the memory has a number n_m of memory ports. A
representative system validity function V() for this system is, in
canonical form:

    V() = ((instrSize <= 64) & (n_p <= n_m) & (intLitSize <= 32)) OR
          ((instrSize <= 64) & (n_p = n_m) & (memLitSize <= 32)).

This validity function includes an OR of the following two AND
expressions:

    (instrSize <= 64) & (n_p <= n_m) & (intLitSize <= 32); and
    (instrSize <= 64) & (n_p = n_m) & (memLitSize <= 32).

The terms in this validity function are: (instrSize <= 64),
(n_p <= n_m), (n_p = n_m), (intLitSize <= 32) and (memLitSize <= 32).
The term (instrSize <= 64) is a singleton term that appears in both AND
expressions and depends on a parameter of the processor only, and is
therefore a common term. The remaining terms are partial terms.
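
The classification of terms can be illustrated with a short Python
sketch. The representation of a term as a (label, components,
predicate) tuple, the component names "proc" and "mem", and the helper
common_terms are assumptions made for this illustration only.

```python
# Each term: (label, components it reads, predicate over a design dict).
T_INSTR = ("instrSize<=64", {"proc"}, lambda d: d["instrSize"] <= 64)
T_LE = ("n_p<=n_m", {"proc", "mem"}, lambda d: d["n_p"] <= d["n_m"])
T_EQ = ("n_p=n_m", {"proc", "mem"}, lambda d: d["n_p"] == d["n_m"])
T_INT = ("intLitSize<=32", {"proc"}, lambda d: d["intLitSize"] <= 32)
T_MEM = ("memLitSize<=32", {"proc"}, lambda d: d["memLitSize"] <= 32)

# Canonical OR-AND form: an OR over AND expressions.
V_FORM = [[T_INSTR, T_LE, T_INT], [T_INSTR, T_EQ, T_MEM]]

def V(design):
    """System validity: TRUE if any AND expression is fully satisfied."""
    return any(all(pred(design) for _, _, pred in and_expr)
               for and_expr in V_FORM)

def common_terms(form):
    """Labels of single-component terms found in every AND expression."""
    per_expr = [{label for label, comps, _ in and_expr if len(comps) == 1}
                for and_expr in form]
    return set.intersection(*per_expr)
```

Under these assumptions, common_terms(V_FORM) yields only the
instruction size term, while the literal size terms are excluded
because each occurs in just one AND expression.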

Common terms in the validity function, such as
(instrSize <= 64), are evaluated with reference to a component design
for a single component. The corresponding common component validity
filter (one of the common component validity filters C_1, ..., C_n)
evaluates the term (instrSize <= 64) based on the processor design
only, without consideration of the memory design. The terms
(intLitSize <= 32) and (memLitSize <= 32) appear to qualify as common
terms but do not appear in both AND expressions. Because
(intLitSize <= 32) does not appear in both AND expressions, a component
design that does not satisfy the term (intLitSize <= 32) can be an
element of a validity set. The result of an evaluation of a validity
predicate that includes a common term is TRUE (valid) only if the
common term is also TRUE (valid). Consequently, component designs that
do not satisfy a common term are not part of any valid system design.

Elimination of invalid component designs simplifies system design.
For example, if there are 100 designs each for the processor and the
memory, and the common term (instrSize <= 64) is satisfied by only 40
of the 100 processor designs, then 60 processor designs are excluded by
component validity filtering, and the number of processor and memory
combinations remaining to be considered falls from 10,000 to 4,000.

Partial validity filters V_11, ..., V_nz receive component validity
sets produced by the respective common component validity filters
C_1, ..., C_n and use partial terms in the system validity function to
identify and eliminate invalid component combinations, and to ensure
that designs for different components match to reduce evaluation time
and expense wasted on system designs known to be invalid. The partial
validity filters V_11, ..., V_nz can use expansions of the partial
terms of the system validity function V(). The expansion can produce
singleton terms or additional coupled terms that can be expanded as
well. Such expansion continues until the system validity function has
only singleton terms and common terms, and no coupled terms.

The coupled terms are expanded to obtain all permitted values for
the coupled terms, and to replace the coupled terms with a conjunction
of terms corresponding to each of the permitted values. One term
requires the expansion parameter to take on a particular value and the
other term is a term with the expansion parameter set to the same value.
As an example, the coupled term (n_p <= n_m) can be expanded using
n_p as an expansion parameter for a design space of processors having
one or two data access ports. The substitutions n_p = 1 and n_p = 2 are
made in the validity function, producing a logically equivalent validity
function without coupled terms:

    V() = (instrSize <= 64) & (
            ((n_p = 1) & (n_m >= 1) & (intLitSize <= 32)) OR
            ((n_p = 2) & (n_m >= 2) & (intLitSize <= 32)) OR
            ((n_p = 1) & (n_m = 1) & (memLitSize <= 32)) OR
            ((n_p = 2) & (n_m = 2) & (memLitSize <= 32)))

In this example, a series of equality constraints are produced with
respect to the expanded coupled term. Other expansions of coupled
terms are possible, but every permitted value that the coupled term can
assume for designs in the component design space should satisfy at
least one of the expanded terms. For example, the term n_p <= n_m
can be expanded to include n_p <= 1 and n_p >= 2. In general,
expansions that reduce or eliminate coupled terms simplify design
evaluation.
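
The logical equivalence claimed for the expansion can be checked by
brute force. The following Python sketch assumes that the four expanded
AND expressions are joined by OR (as required for equivalence with the
canonical form) and uses arbitrarily chosen sample parameter values;
the function names are hypothetical.

```python
from itertools import product

def v_original(instrSize, n_p, n_m, intLitSize, memLitSize):
    """Canonical form with the coupled terms n_p <= n_m and n_p = n_m."""
    return ((instrSize <= 64 and n_p <= n_m and intLitSize <= 32) or
            (instrSize <= 64 and n_p == n_m and memLitSize <= 32))

def v_expanded(instrSize, n_p, n_m, intLitSize, memLitSize):
    """Expanded form: n_p substituted by each permitted value (1 or 2)."""
    return instrSize <= 64 and (
        (n_p == 1 and n_m >= 1 and intLitSize <= 32) or
        (n_p == 2 and n_m >= 2 and intLitSize <= 32) or
        (n_p == 1 and n_m == 1 and memLitSize <= 32) or
        (n_p == 2 and n_m == 2 and memLitSize <= 32))

# Assumed sample values; n_p is restricted to {1, 2} as in the example.
space = list(product([32, 64, 128],   # instrSize
                     [1, 2],          # n_p
                     [1, 2, 3, 4],    # n_m
                     [16, 32, 64],    # intLitSize
                     [16, 32, 64]))   # memLitSize

equivalent = all(v_original(*d) == v_expanded(*d) for d in space)
```

The check holds only because n_p is limited to the values used in the
expansion; a processor design with three data access ports would
satisfy none of the expanded terms.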

The expanded form of the system validity function V() is used by
the partial validity splitters V_11, ..., V_nz to determine a set of
partial validity predicates for the component design spaces. The
partial validity predicates are formed by scanning the AND terms in the
system validity function V() and collecting all unique combinations of
terms involving a component. In the above example, the partial validity
predicates for the memory are:

    (n_m >= 1),
    (n_m >= 2),
    (n_m = 1),
    (n_m = 2),

and the partial validity predicates for the processor are:

    n_p = 1 & intLitSize <= 32,
    n_p = 2 & intLitSize <= 32,
    n_p = 1 & memLitSize <= 32,
    n_p = 2 & memLitSize <= 32.

Predicated component design spaces 911_1, ..., 911_n can be
formed based on the partial validity predicates. In the example
discussed previously, the valid designs identified by the common
component validity filters C_1, ..., C_n include the 40 processor
designs that satisfy (instrSize <= 64). Four smaller predicated design
spaces can be formed, each satisfying one of the four processor partial
validity predicates listed above. If a processor design can satisfy
both (intLitSize <= 32) and (memLitSize <= 32), then the predicated
design spaces are not disjoint and a design can belong to more than one
predicated design space.

The system composer 912 combines the component designs from
the predicated design spaces 911_1, ..., 911_n to produce system
designs that are combined in a union operation 913. The system
composer 912 iterates over the AND expressions in the expanded
system validity function V() and splits each AND expression into
sub-expressions each involving parameters from a particular component.
Each sub-expression corresponds to a partial validity predicate and one
of the predicated design spaces 911_1, ..., 911_n. The system composer
912 picks corresponding predicated design spaces, one for each of the
components, and takes the Cartesian product of the predicated design
spaces 911_1, ..., 911_n, producing a set of system designs.
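
The iteration performed by the system composer can be sketched in
Python for the processor and memory example. The design records, their
identifiers, and the function compose are hypothetical; each AND
expression of the expanded validity function is written here as an
assumed pair of partial validity predicates, one per component.

```python
from itertools import product

# Hypothetical designs that already satisfy the common term
# (instrSize <= 64); identifiers and values are illustrative only.
procs = [
    {"id": "P1", "n_p": 1, "intLitSize": 32, "memLitSize": 64},
    {"id": "P2", "n_p": 2, "intLitSize": 64, "memLitSize": 32},
    {"id": "P3", "n_p": 2, "intLitSize": 64, "memLitSize": 64},
]
mems = [{"id": "M1", "n_m": 1}, {"id": "M2", "n_m": 2}]

# One (processor predicate, memory predicate) pair per AND expression.
AND_EXPRESSIONS = [
    (lambda p: p["n_p"] == 1 and p["intLitSize"] <= 32,
     lambda m: m["n_m"] >= 1),
    (lambda p: p["n_p"] == 2 and p["intLitSize"] <= 32,
     lambda m: m["n_m"] >= 2),
    (lambda p: p["n_p"] == 1 and p["memLitSize"] <= 32,
     lambda m: m["n_m"] == 1),
    (lambda p: p["n_p"] == 2 and p["memLitSize"] <= 32,
     lambda m: m["n_m"] == 2),
]

def compose(procs, mems, and_expressions):
    """Union of Cartesian products of matching predicated design spaces."""
    systems = set()
    for proc_pred, mem_pred in and_expressions:
        proc_space = [p["id"] for p in procs if proc_pred(p)]
        mem_space = [m["id"] for m in mems if mem_pred(m)]
        systems |= set(product(proc_space, mem_space))
    return systems

systems = compose(procs, mems, AND_EXPRESSIONS)
```

Using a set for the union means that a system design reachable through
more than one AND expression is kept only once, which matters when the
predicated design spaces are not disjoint.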

After the combining step 913 produces the set of system
designs, a system quality filter 917 receives the system validity set
and produces, for example, a Pareto curve or a Pareto set for the
system. The quality filter 917 receives system designs after several
stages of validity filtering and thus identifies quality designs from
valid designs. Without prior validity filtering, the quality filter can
identify invalid quality designs without identifying any valid designs.
FIG. 10 illustrates a method similar to that of FIG. 9 that is
typically more efficient. In FIG. 10, full Cartesian products of
component quality sets are not constructed. Instead, partial Cartesian
products (denoted as "X_p") are formed, eliminating some system designs
from further consideration. Such system designs are eliminated by
considering system designs that are currently members of the system
quality set and by finding lower bounds on the evaluation metrics of the
eliminated systems. This procedure is applicable when the
decomposition is monotonic.

Prior to forming the partial Cartesian products, the component
quality filters Q_1, ..., Q_n find the lowest values for each of the
evaluation metrics of the component quality sets. As the Cartesian
product X_p is formed, full system designs are produced by combining
component designs. After a subset of component designs is selected,
the respective evaluation metrics are used in conjunction with the best
values of the evaluation metrics of the unselected components to obtain
(using the monotonicity property) lower bounds on the evaluation
metrics of any system design that includes the selected components. The
lower bound is then compared with the system designs in the partially
completed system quality set. If the lower bound is eclipsed by any
system in this set, then the partial Cartesian product module does not
combine these components to produce system designs because such
designs are known to be eclipsed.
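
A minimal Python sketch of such a partial Cartesian product follows.
It assumes two components, (cost, time) evaluation metrics, and an
additive (hence monotonic) system evaluation; these choices and the
function names are illustrative assumptions rather than the method of
FIG. 10 itself.

```python
def eclipses(a, b):
    """True if point a is no worse than b everywhere and better somewhere."""
    return (all(x <= y for x, y in zip(a, b)) and
            any(x < y for x, y in zip(a, b)))

def add(a, b):
    """Assumed monotonic system evaluation: metrics simply add."""
    return tuple(x + y for x, y in zip(a, b))

def insert(quality_set, point):
    """Add point to the quality set unless an existing point eclipses it."""
    if any(eclipses(q, point) or q == point for q in quality_set):
        return quality_set
    return [q for q in quality_set if not eclipses(point, q)] + [point]

def partial_product(first, second):
    """Combine two component quality sets, pruning by lower bound."""
    # Lowest value of each metric over the unselected component.
    best_second = tuple(min(p[j] for p in second) for j in range(2))
    quality_set, pruned = [], 0
    for a in first:
        lower_bound = add(a, best_second)
        if any(eclipses(q, lower_bound) for q in quality_set):
            pruned += 1  # every system design containing `a` is eclipsed
            continue
        for b in second:
            quality_set = insert(quality_set, add(a, b))
    return quality_set, pruned

first = [(1, 5), (2, 3), (6, 9)]   # hypothetical (cost, time) points
second = [(1, 4), (3, 1)]
quality_set, pruned = partial_product(first, second)
```

In this example the third design of the first component is never
combined with the second component, because its lower bound (7, 10) is
already eclipsed by a member of the partially completed quality set.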

Other combinations of validity filtering and quality filtering are
obtained by combining the methods illustrated in FIGS. 1-10 and noting
that components of a system are frequently decomposable into
(sub)components.

Quality filtering generally produces a Pareto set or an 
approximation to a Pareto set. One or more evaluation functions E(d) 
produce evaluation metrics that permit comparison of various designs. 
For convenience, quality filtering is further described below with respect 
to a two-dimensional quality metric (such as cost and execution time for 
a processor system), and with reference to processor system design. 

FIG. 11 shows a mapping of designs d_1, d_2, ..., d_n into the
two-dimensional time/cost space. While the mapping of FIG. 11 appears
straightforward, the actual computation of E(d) for each of the designs
d_1, d_2, ..., d_n can be expensive and time-consuming, requiring
simulation of each of the designs and evaluation of the design time
based on the benchmark application. Because the computation of E(d) is
expensive and slow, the design space DS is generally not fully explored
(i.e., for some designs E(d) is not evaluated) and a design is selected
without evaluating all the available designs. Reducing the number of
designs d to be evaluated (by validity or quality filtering or a
combination thereof) significantly reduces the difficulty of
identifying a preferred design.

The evaluation function E(d) permits determination of superior
designs by inspecting the mapping of the designs to the m-dimensional
performance criteria space. If the evaluation function E(d) maps designs
d_i, d_k to respective m-dimensional coordinates
(e^i_0, ..., e^i_(m-1)), (e^k_0, ..., e^k_(m-1)), then the design d_k
is said to "eclipse" the design d_i if the design d_k is superior to
d_i in at least one evaluation criterion (and no worse in all other
criteria), that is, if e^k_j < e^i_j for at least one value of j and
e^k_j <= e^i_j for all other values. The m-dimensional coordinate
associated with a design d is referred to as a "design point," or simply
as a design. Because the coordinates e_j correspond to cost, time, or
other performance criteria that are preferably minimized, the design
d_k that eclipses the design d_i is either cheaper, quicker, or in some
other fashion superior to the design d_i. In some cases, some (or all)
of the coordinates e_j of competing designs are equal. If
e^k_j <= e^i_j for all values of j, the design d_k is said to "weakly"
eclipse the design d_i (i.e., the design d_k is not inferior to the
design d_i).
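
The two comparisons can be written directly as Python predicates over
coordinate tuples. This is a sketch only; the function names are
hypothetical, and all coordinates are assumed to be criteria that are
minimized.

```python
def weakly_eclipses(e_k, e_i):
    """d_k weakly eclipses d_i: no coordinate of d_k is worse."""
    return all(a <= b for a, b in zip(e_k, e_i))

def eclipses(e_k, e_i):
    """d_k eclipses d_i: no coordinate worse, at least one strictly better."""
    return weakly_eclipses(e_k, e_i) and any(a < b for a, b in zip(e_k, e_i))
```

For example, a design at (1, 2) eclipses a design at (2, 2), two
designs at identical coordinates weakly eclipse each other without
either eclipsing the other, and designs at (1, 3) and (2, 2) are
incomparable.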

In FIG. 11, the design d_1 is shown along with an eclipsing region
1101 of the design d_1. The design d_2 is within the eclipsing region
1101, and is eclipsed by the design d_1. As is apparent from FIG. 11,
the design d_1 has both a lower cost and a shorter execution time and is
therefore superior to design d_2. Referring to the design d_3, an
eclipsing region 1103 of the design d_3 is illustrated. The eclipsing
region of any design d_i is defined as a region in the design space for
which coordinate values e_j are greater than the coordinate values
e^i_j of the design d_i. In FIG. 11, the eclipsing regions 1101, 1103
are quarter planes extending in the positive time and cost directions.

A goal of processor system design or processor subsystem design
(for example, design of a cache memory) is to identify designs with low
execution times and costs, i.e., designs that eclipse other designs. A
design d_p is referred to as a "Pareto" design if it is not eclipsed by
any other design. A comprehensive Pareto set is defined as the set P_p
of all the Pareto designs d_p. For some systems, the evaluation function
E(d) maps several designs to the same coordinates. A Pareto set P_sp is
a subset of the comprehensive Pareto set P_p that includes at least one
of the Pareto designs that have the same coordinates. The eclipsing
region of a Pareto set is a union of all the eclipsing regions of the
Pareto designs. All designs that fall within the eclipsing region of a
Pareto set P_sp are eclipsed by one or more designs in the Pareto set
P_sp. A Pareto surface (a curve in a 2-dimensional space) partitions the
eclipsing region of a Pareto set from the rest of the m-dimensional
space. For the 2-dimensional mapping of FIG. 11, the Pareto surface is a
2-dimensional curve defined by a union of all the eclipsing regions
(quarter planes). Thus, the Pareto curve is a set of alternating
horizontal and vertical line segments connecting the coordinates of the
Pareto designs.
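
As an illustrative sketch, a Pareto set and the vertices of the
corresponding staircase curve can be computed in Python for (cost,
time) design points. The sample points and the function names are
assumptions made for this example.

```python
def eclipses(a, b):
    """True if a is no worse than b everywhere and better somewhere."""
    return (all(x <= y for x, y in zip(a, b)) and
            any(x < y for x, y in zip(a, b)))

def pareto_set(points):
    """Design points not eclipsed by any other point, sorted by cost."""
    unique = set(points)
    return sorted(p for p in unique
                  if not any(eclipses(q, p) for q in unique))

def pareto_curve(front):
    """Vertices of the staircase joining a non-empty, cost-sorted front."""
    vertices = []
    for (c0, t0), (c1, _) in zip(front, front[1:]):
        # Horizontal segment at time t0, then a vertical drop at cost c1.
        vertices += [(c0, t0), (c1, t0)]
    vertices.append(front[-1])
    return vertices

points = [(1, 9), (2, 6), (3, 7), (4, 3), (5, 5)]  # hypothetical designs
front = pareto_set(points)
curve = pareto_curve(front)
```

Each pair of consecutive Pareto points contributes one horizontal and
one vertical segment, which traces the boundary of the union of the
quarter planes between the first and last Pareto designs.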
A quality set can also be an approximation to the Pareto set. For
example, the evaluation metrics can be calculated with reduced accuracy
to simplify the evaluation function. In this case, it is difficult to
determine if designs are Pareto designs. Designs that have evaluation
metrics that are equal within a range dependent on the inaccuracy in the
computation of the evaluation metrics appear equivalent and can be
retained in a quality set. In other cases, increased design freedom can
be achieved by adding known non-Pareto designs to a quality set. The
additional designs are generally close to Pareto designs.

Given a Pareto curve or a comprehensive Pareto set, a design can
be selected programmatically to achieve a predetermined cost or time, or
combination of cost and time. Using the Pareto curve (or the
comprehensive Pareto set), superior designs are not overlooked.
However, construction of the Pareto curve and the comprehensive
Pareto set by exhaustively evaluating all possible designs is generally
infeasible due to the large number of design variables available as well
as the complexity of evaluating a particular design. As shown in, for
example, FIGS. 4-8, a processor system or other system of interest can
be divided into components and component design spaces can be
quality filtered (i.e., Pareto filtered) to produce component quality
sets that are component Pareto sets. Combining the component Pareto
curves or sets constructs a comprehensive Pareto curve or Pareto set for
the system. For example, a system design d is a composition of
component designs d^1, d^2, ..., d^p, and a set of system designs is
obtained from the Cartesian product of sets of component designs, i.e.,
the set of system designs is the set of all combinations of the



HP1 0990408-1 29 

Express Mail No. EM295378042US 

component designs. The program can also determine the validity of a 
component design or a combination of component designs, as described 
previously. 

If the cost and execution time (or other selected performance 
criteria) of a system are monotonically non-decreasing functions, 
replacing a component with a cheaper (faster) component makes the 
system cheaper (faster). In this case, the comprehensive set of designs 
obtained from the component Pareto sets can include some non-Pareto 
designs but includes all the designs of the comprehensive Pareto set. If 
cost and execution time are generally, but not always, monotonically 
non-decreasing functions, the comprehensive set of designs obtained 
from the component Pareto sets may contain non-Pareto designs and 
may lack some Pareto designs. However, the designs included in this 
comprehensive set can approximate the Pareto designs, and a near- 
Pareto design can be selected from this set. Such a set of designs is 
also a quality set. 
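
The monotonic case can be illustrated with a small Python check. The
sketch assumes an additive system evaluation (one particular monotonic
composition) and hypothetical (cost, time) points for a processor and a
cache; the function names are illustrative.

```python
from itertools import product

def eclipses(a, b):
    """True if a is no worse than b everywhere and better somewhere."""
    return (all(x <= y for x, y in zip(a, b)) and
            any(x < y for x, y in zip(a, b)))

def pareto(points):
    """Set of points not eclipsed by any other point."""
    pts = set(points)
    return {p for p in pts if not any(eclipses(q, p) for q in pts)}

def combine(a, b):
    """Assumed monotonic composition: cost and time both add."""
    return (a[0] + b[0], a[1] + b[1])

procs = [(1, 10), (5, 5), (10, 1), (6, 6)]   # (6, 6) is eclipsed by (5, 5)
caches = [(1, 10), (5, 5), (10, 1)]

# System designs from all components versus from component Pareto sets.
full = {combine(p, c) for p, c in product(procs, caches)}
from_fronts = {combine(p, c)
               for p, c in product(pareto(procs), pareto(caches))}

system_pareto = pareto(full)
extra = from_fronts - system_pareto   # non-Pareto designs that remain
```

Here every system Pareto design is found among the designs composed
from the component Pareto sets, while the point (11, 11) shows that the
composed set can still contain a non-Pareto design.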

The evaluation of a design d depends on the manner in which the 
performance criteria for the components are combined. For a sequential 
system, the total value of a selected performance criterion is the sum of 
the corresponding values for the components. An example of such a 
system is a system that combines a processor and a cache memory. In 
such a system, the processor is either busy or waiting for the cache and 
the total execution time is the sum of the times associated with the 
processor and the cache. The total cost is the sum of the costs of the 
components. In a parallel system, all (or many) components of the 
system are busy simultaneously, and the execution time is the maximum 
of the execution times for each of the components while the cost is the 
sum of the component costs. In many systems, no such simple 
evaluation of system designs based on component designs is possible. 




For some such systems, system evaluation is individually performed for 
each system design. 
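
The sequential and parallel cases described above can be sketched as
two Python evaluation functions over component (cost, time) pairs. The
function names and the sample values are hypothetical.

```python
def evaluate_sequential(components):
    """Sequential system: both cost and execution time add."""
    return (sum(cost for cost, _ in components),
            sum(time for _, time in components))

def evaluate_parallel(components):
    """Parallel system: cost adds, execution time is the maximum."""
    return (sum(cost for cost, _ in components),
            max(time for _, time in components))

components = [(3, 10), (2, 4)]   # hypothetical (cost, time) pairs
sequential = evaluate_sequential(components)
parallel = evaluate_parallel(components)
```

Both functions are monotonically non-decreasing in every component
metric, so either can serve as the system evaluation when component
Pareto sets are combined.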

System components can be independent in that the components
do not interact with respect to cost or execution time. For such a
decomposition, a single Pareto curve (or comprehensive Pareto set) for
each of the components is sufficient for preparation of a Pareto curve or
a comprehensive Pareto set for the system. In other cases, the
components interact and one or more Pareto curves for each component
can be necessary. For example, components of systems having validity
predicates that contain one or more coupled terms interact, and
consideration must be given to valid combinations because not all
combinations of valid components are valid.

An example system having interacting components is a processor 
system that includes a processor and a cache that communicate with n 
ports. For this system, component Paretos are prepared for processors 
and caches having various numbers n of ports. A combined Pareto is 
obtained by combining processor and cache Paretos having the same 
number of ports. Because the processor and cache are matched with 
respect to the number of ports, the designs of the combined Pareto 
curve or Pareto set correspond to actual system designs. Interactions 
such as this affect the validity of a system design that is a combination 
of component designs. 
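
Matching component Paretos by port count can be sketched in Python as
follows. The sketch assumes that each component Pareto is stored in a
dictionary keyed by the number of ports n, that the design points are
(cost, time) pairs, and that the processor and cache form a sequential
system so that the metrics add; all names and values are hypothetical.

```python
def eclipses(a, b):
    """True if a is no worse than b everywhere and better somewhere."""
    return (all(x <= y for x, y in zip(a, b)) and
            any(x < y for x, y in zip(a, b)))

def pareto(points):
    """Sorted list of points not eclipsed by any other point."""
    pts = set(points)
    return sorted(p for p in pts if not any(eclipses(q, p) for q in pts))

def combine_matched(proc_paretos, cache_paretos):
    """Combine only processor and cache designs with equal port counts."""
    points = set()
    for n in proc_paretos.keys() & cache_paretos.keys():
        for p_cost, p_time in proc_paretos[n]:
            for c_cost, c_time in cache_paretos[n]:
                points.add((p_cost + c_cost, p_time + c_time))
    return pareto(points)

# Hypothetical component Paretos keyed by the number of ports n.
proc_paretos = {1: [(2, 10)], 2: [(4, 6)]}
cache_paretos = {1: [(1, 5)], 2: [(3, 3)], 4: [(6, 2)]}
combined = combine_matched(proc_paretos, cache_paretos)
```

The four-port cache contributes nothing to the combined Pareto in this
example because no processor Pareto exists for four ports.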

In some cases, the evaluation function E(d) is only an
approximation. For such cases, some non-Pareto designs can be
included in a quality set because of the uncertainty in E(d). If a bound
on the inaccuracy of E(d) is known, then some designs obtained by
combining component designs from the component Pareto sets can be
eliminated by showing that these designs have higher costs or longer
execution times than some other designs. Such designs can be excluded
from the comprehensive Pareto set.

In some systems, the cost, execution time, or other performance
criteria of one system component depend upon one or more features of
another system component. For example, the number of stall cycles
caused by a miss in a first level cache depends on the number of misses
in the first level cache and the miss penalty of the first level cache.
The miss penalty of the first level cache depends on the time required
to access a second level cache or main memory. This access time is
generally known only when first level cache and second level cache
designs are combined.
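
The dependence of stall cycles on the combined design can be sketched
in Python. The additive two-level model below, its parameter names, and
the sample numbers are assumptions made for illustration and are not a
formula given in this description.

```python
def stall_cycles(l1_misses, l2_misses, l2_access_cycles,
                 memory_access_cycles):
    """Stall cycles under an assumed additive two-level model.

    Every first level miss pays the second level access time, and every
    second level miss additionally pays the main memory access time.
    """
    return (l1_misses * l2_access_cycles +
            l2_misses * memory_access_cycles)

# Hypothetical miss counts and access times.
total = stall_cycles(l1_misses=1000, l2_misses=100,
                     l2_access_cycles=10, memory_access_cycles=100)
```

The miss counts can be measured for each cache component separately,
but the access times, and therefore the stall cycles, are fixed only
once a first level design is paired with a second level design.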

The comprehensive Pareto set produced by combining component
Pareto sets can also serve as a component Pareto set for a higher level
system. For example, the comprehensive Pareto set for a cache memory
obtained by combining component designs for a first level cache and a
second level cache not only permits selection of a Pareto cache design,
but serves as a component Pareto set for a processor system that
includes such a cache memory as a component.

FIG. 12 is a block diagram of a processor system 1200 used to
illustrate processor system design and cache memory design using
component Pareto curves or component Pareto sets as described above.
The processor system 1200 includes a very long instruction word (VLIW)
processor 1201, a systolic array 1203, and a cache memory 1205. The
cache memory 1205 is a so-called "split" cache and includes a first
level cache L1 that has an instruction cache (i-cache) 1209 and a data
cache (d-cache) 1207, and a second level cache L2 comprising a unified
cache (u-cache) 1211. (The i-cache 1209, d-cache 1207, and the u-cache
1211 are referred to below as "cache components.") The i-cache 1209
communicates with the VLIW processor 1201 via an instruction port 1213;
the d-cache 1207 communicates with the VLIW processor 1201 via one or
more data ports 1215. The u-cache 1211 communicates with the
systolic array 1203 via one or more systolic ports 1217. The u-cache
1211 also includes one or more u-cache ports 1219 for communication
with the i-cache 1209 and the d-cache 1207 and can include a bypass
port 1221 for communicating directly with the VLIW processor 1201. If
the bypass port 1221 is enabled, the number of u-cache ports 1219 is the
maximum of the number of data ports 1215 and the number of systolic
ports 1217. If the bypass port 1221 is disabled, the maximum number
of u-cache ports 1219 is the maximum of 1 and the number of the
systolic ports 1217.
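
The port rule just stated can be written as a short Python function.
The function and parameter names are hypothetical, and the sketch
treats the stated value as the number of u-cache ports in both cases.

```python
def u_cache_ports(data_ports, systolic_ports, bypass_enabled):
    """Number of u-cache ports implied by the bypass port setting."""
    if bypass_enabled:
        # Bypass enabled: enough ports for the data or systolic ports.
        return max(data_ports, systolic_ports)
    # Bypass disabled: at least one port, or one per systolic port.
    return max(1, systolic_ports)
```

A function of this kind can serve as a coupled validity term, since it
relates a parameter of the u-cache to parameters of the d-cache and the
systolic array.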

The i-cache 1209 provides storage for instructions for the VLIW
processor 1201; if the i-cache 1209 does not contain an instruction
requested by the VLIW processor 1201, then the i-cache 1209 attempts
to retrieve the instruction from the u-cache 1211. Similarly, if the
d-cache 1207 contains data requested by the VLIW processor 1201, the
data is retrieved directly from the d-cache 1207. If not, then the
d-cache 1207 attempts to retrieve the data from the u-cache 1211. If the
requested data or instruction is not found in the u-cache 1211, then the
u-cache 1211 requests the data from conventional memory (RAM or
ROM).

The processor system 1200 can be considered to be a system formed of
three components, the VLIW processor 1201, the systolic array
1203, and the cache 1205. Each of these components has an
associated design space, and a processor design space can be quality
filtered and validity filtered as shown in FIGS. 4-9. In addition, the
cache 1205 can be considered to be a system formed of three
components, the d-cache 1207, the i-cache 1209, and the u-cache
1211. Thus, the cache 1205 is both a component of a system and a system
formed of components, and the design space of the processor system 1200
is a hierarchical design space of at least two levels.

As a first example of quality filtering using component Pareto
curves or component Pareto sets, the design of the cache memory 1205
is illustrated using component Pareto curves for the i-cache 1209,
d-cache 1207, and u-cache 1211. As discussed above, cost and
execution time are the selected performance criteria. This and other
examples are described using Pareto curves to graphically represent the
quality sets, but either Pareto curves or Pareto sets can be used. In
addition, Pareto curves are generally indicated as smooth curves
connecting the Pareto design points.

RAM and ROM can also be included in the design selection
process. The design of the cache memory 1205 includes selection of
total cache memory size (the sum of the memory sizes for the cache
components, i.e., the i-cache 1209, d-cache 1207, and u-cache 1211),
the allocation of memory to each of the components, and other
parameters discussed below. To evaluate the designs (i.e., compute
E(d)), a representative design for the VLIW processor 1201 is selected
and the execution time is based upon the execution time of a benchmark
application program (GHOSTSCRIPT) on a predetermined input data file.
GHOSTSCRIPT is a widely available application program that converts
document files from a POSTSCRIPT format into formats suitable for
printers that are unable to interpret POSTSCRIPT. A benchmark input
file is provided so that the benchmark application processes the same
data in evaluating each design.

The execution times of the i-cache 1209 and d-cache 1207 (the
first level cache L1) depend on the design of the u-cache 1211.
Initially, the design times for the i-cache 1209, d-cache 1207, and
u-cache 1211 are expressed as cache misses, i.e., the number of times
data requested from a cache component is not available in the cache
component. The actual execution time associated with a first level L1
cache miss depends on the number of access cycles required to access the
u-cache 1211. The execution time associated with a second level L2 cache
miss depends on the time required to access main memory. The probability
of




a cache miss in a cache component depends on the size of the cache
component. In evaluating the cache memory 1205 or the cache
components (the i-cache 1209, the d-cache 1207, and the u-cache
1211), the number of times requested data or instructions are not in the
d-cache 1207, the i-cache 1209, or the u-cache 1211 is obtained based
upon the simulated execution of the GHOSTSCRIPT application program.

The cache components can be configured in several ways. The
cache components can be divided into memory banks (sometimes
referred to as "ways") with the ways being further divided into "lines."
Lines are the smallest independently addressable memory blocks in the
cache. The cache components can use any of several hashing
algorithms for determining a cache location for storing data from a
particular main memory location. If data from any main memory address
can be replicated anywhere in the cache, then the cache is referred to
as a fully-associative cache. A cache divided into N memory banks such
that data from any main memory address can be replicated in any of the
N memory banks is referred to as an N-way set-associative cache. A
1-way set-associative cache is generally referred to as a direct-mapped
cache. An N-way set-associative cache is said to have an "associativity"
of N.
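
As a sketch of one possible placement rule, the following Python
functions compute the number of sets and the set selected for a main
memory address. Modulo placement on the line number is an assumption
made here for illustration; the description leaves the hashing
algorithm open, and the function names are hypothetical.

```python
def num_sets(cache_bytes, line_bytes, associativity):
    """Number of sets in an N-way set-associative cache."""
    return cache_bytes // (line_bytes * associativity)

def set_index(address, cache_bytes, line_bytes, associativity):
    """Set selected for an address under assumed modulo placement."""
    line_number = address // line_bytes
    return line_number % num_sets(cache_bytes, line_bytes, associativity)
```

For a 2 kB direct-mapped cache with 16-byte lines this gives 128 sets,
while a cache whose associativity equals its number of lines has a
single set and therefore behaves as a fully-associative cache.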

In the cache design example described below, the line sizes for the
d-cache 1207, i-cache 1209, and u-cache 1211 are fixed at 16 bytes,
32 bytes, and 32 bytes, respectively. In the design process, the d-cache
1207 is assumed to be a direct-mapped cache, while the designs of the
i-cache 1209 and u-cache 1211 are considered having associativities of
1 and 2, and 2 and 4, respectively. In other cache designs, these parameters
can be allowed to vary or take on additional values. The memory sizes
and line sizes of the cache components are restricted to powers of 2.
Each of the cache components is evaluated individually. The
d-cache 1207 is evaluated as a function of cache size only, as a
direct-mapped cache with a line size of 16 bytes. FIG. 13 contains a Pareto
curve 1301 for the d-cache 1207 for cache sizes of 2, 4, 8, and 16 kB.
FIG. 13 also shows Pareto designs 1303, 1305, 1307, 1309 for the
d-cache 1207. The Pareto curve 1301 is graphed with design execution
time (d-cache misses N_d) on a vertical axis 1311 and cache cost (wafer
area) on a horizontal axis 1313. An approximate Pareto curve 1315
connects the Pareto designs 1303, 1305, 1307, 1309.
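The selection of Pareto designs for a single cache component can be sketched as follows. This is an illustrative sketch only: the function name pareto_set and the cost and miss values are hypothetical and are not read from FIG. 13.

```python
def pareto_set(designs):
    """Return the designs that no other design eclipses.

    Each design is (label, cost, time); lower is better for both metrics.
    """
    kept = []
    for label, cost, time in designs:
        eclipsed = any(
            c <= cost and t <= time and (c < cost or t < time)
            for _, c, t in designs
        )
        if not eclipsed:
            kept.append((label, cost, time))
    return sorted(kept, key=lambda d: d[1])  # order by cost, as along a Pareto curve


# Hypothetical (cost, d-cache misses) values, not taken from FIG. 13.
d_cache_designs = [
    ("2 kB", 1.0, 9000),
    ("4 kB", 2.0, 6000),
    ("8 kB", 4.0, 4500),
    ("16 kB", 8.0, 4000),
]
d_cache_pareto = pareto_set(d_cache_designs)
```

With these assumed values each larger cache costs more and misses less, so all four designs remain in the Pareto set.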

The line size of the i-cache 1209 is fixed at 32 bytes. The size of
the i-cache 1209 ranges from 2 kB to 64 kB and associativities of 1 and
2 are considered. The costs and execution times for these combinations
of size and associativity are determined based on the number of cache
misses in the i-cache 1209 as a function of cache size based on the
simulated execution of the GHOSTSCRIPT application with a
predetermined design of the VLIW processor 1201. FIG. 14 contains a
Pareto curve 1401 for the i-cache 1209 that is plotted with respect to
coordinate axes 1405, 1407 corresponding to execution time (i.e.,
i-cache misses N_i) and cost, respectively. FIG. 14 also shows Pareto
design points 1403 as well as non-Pareto design points 1409. The
Pareto curve 1401 eclipses the non-Pareto design points 1409. As
discussed above, the execution time is determined as a number of
i-cache misses, i.e., the number of times the VLIW processor 1201 is
unable to retrieve the requested instruction directly from the i-cache
1209 while executing the GHOSTSCRIPT application.
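The i-cache design space considered in this example can be enumerated with a short sketch. The dictionary keys and helper name are hypothetical, and the size range is assumed to be inclusive at both ends; the power-of-2 restriction stated earlier acts as a simple validity check.

```python
from itertools import product


def is_power_of_two(n: int) -> bool:
    """Memory sizes in the example are restricted to powers of 2."""
    return n > 0 and (n & (n - 1)) == 0


I_CACHE_LINE_BYTES = 32                      # fixed line size for the i-cache
sizes_kb = [s for s in range(2, 65) if is_power_of_two(s)]
associativities = [1, 2]

# Every combination of size and associativity is a candidate i-cache design.
i_cache_designs = [
    {"size_kb": s, "ways": w, "line_bytes": I_CACHE_LINE_BYTES}
    for s, w in product(sizes_kb, associativities)
]
```

Each candidate would then be evaluated (cost and i-cache misses) before Pareto and non-Pareto design points are separated.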

For both the d-cache 1207 and the i-cache 1209, the actual
execution time depends on the design of the u-cache 1211. Design of
the u-cache 1211 is considered next. Design variables for the u-cache
1211 considered in this design example include cache size (64 kB to 2
MB) and associativity (2 and 4). The u-cache 1211 communicates
with main memory via a system bus and requires a main memory cycle
time t_main to retrieve data from main memory. The u-cache designs
considered require an access time (t_access) that is equivalent to 3-7
processor clock cycles to supply data not found in the i-cache 1209 or
the d-cache 1207. FIG. 15 contains a component Pareto curve 1501 for the
u-cache 1211 and Pareto design points 1503, 1505, 1507, 1509, 1511
that correspond to access times of 3, 4, 5, 6, and 7 processor clock
cycles, respectively. FIG. 15 also shows non-Pareto design points 1513.
For convenience, the Pareto curve 1501 is shown as a smooth curve
connecting the Pareto design points.

FIG. 16 contains a combined Pareto curve 1601 obtained from the
component Pareto curves 1301, 1401, 1501. To obtain the combined
Pareto curve 1601, a Pareto design point is selected from each of the
Pareto curves 1301, 1401, 1501 and the corresponding costs and the
execution times are summed. The costs are summed directly. The
design execution time is obtained as the sum
(N_d + N_i)*t_access + N_u*t_main, where N_d, N_i, and N_u are the
numbers of misses in the d-cache, the i-cache, and the u-cache,
respectively. As shown in FIG. 16, the design execution time is
conveniently expressed in terms of stall cycles, i.e., the number of
processor clock cycles for which the VLIW processor 1201 waits for the
necessary instruction or data to be retrieved. Inspection of FIG. 16
permits selection of a cache design based on cost and design execution
time. There are no designs superior to (i.e., which eclipse) the designs
of FIG. 16 and selection of a design from FIG. 16 permits selecting a
preferred combination of cost and execution time. Alternatively, a cache
design can be selected based on a combined Pareto set (the design points
that define the combined Pareto curve 1601), instead of the graphical
representation of the Pareto set.
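The combination step can be sketched as follows, using the stall-cycle expression (N_d + N_i)*t_access + N_u*t_main. All numeric values, the value chosen for t_main, and the function names are hypothetical and serve only to make the arithmetic concrete.

```python
from itertools import product

T_MAIN = 50  # assumed main memory cycle time, in processor clock cycles

# Hypothetical component Pareto points.
d_pareto = [(1.0, 9000), (2.0, 6000)]        # (cost, N_d)
i_pareto = [(1.5, 7000), (3.0, 3000)]        # (cost, N_i)
u_pareto = [(6.0, 900, 3), (10.0, 400, 4)]   # (cost, N_u, t_access)


def stall_cycles(n_d, n_i, n_u, t_access, t_main=T_MAIN):
    """(N_d + N_i) * t_access + N_u * t_main, as in the text."""
    return (n_d + n_i) * t_access + n_u * t_main


def pareto(points):
    """Keep (cost, time) points that no other point eclipses."""
    unique = set(points)
    return sorted(
        p for p in unique
        if not any(q != p and q[0] <= p[0] and q[1] <= p[1] for q in unique)
    )


# One Pareto point from each component; costs add, stall cycles follow the formula.
combined = [
    (cd + ci + cu, stall_cycles(nd, ni, nu, ta))
    for (cd, nd), (ci, ni), (cu, nu, ta) in product(d_pareto, i_pareto, u_pareto)
]
combined_pareto = pareto(combined)
```

Only component Pareto points enter the product, so eight combinations are formed here rather than every combination of every candidate cache design; the final filter then discards combinations that are eclipsed.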

A design for a combination of the VLIW processor 1201 and the
cache memory 1205 can similarly be selected using component Pareto
curves. First, component Pareto curves are obtained for the VLIW
processor 1201 and the cache memory 1205. FIG. 16 contains the
combined Pareto curve 1601 for the cache memory 1205. The Pareto
curve 1601 serves as a component Pareto curve for the VLIW
processor/cache memory system. A component Pareto curve for the
VLIW processor 1201 is prepared as described above and is shown as a
curve 1801 in FIG. 18. Execution time (number of VLIW processor
cycles) is graphed along a vertical axis 1803 and cost (area) is graphed
along a horizontal axis 1805. FIG. 18 also shows VLIW processor
Pareto design points 1807.

FIG. 19 contains a combined Pareto curve 1901 obtained with the
Pareto curves 1601, 1801 of FIGS. 16, 18, respectively. Pareto design
points 1903 are obtained by selecting a Pareto design point from each of
the Pareto curves 1601, 1801 and summing the costs (areas) and
execution times.

As yet another example of design selection using component
Pareto sets or curves to form a comprehensive Pareto set, a design for
the VLIW processor system 1200 of FIG. 12 can be selected using
component Pareto sets or curves for the VLIW processor 1201, the
systolic array 1203, and the combined cache 1205 to prepare a
combined Pareto set. As in the previous examples, the performance
criteria are cost and execution time. FIG. 17 contains graphs of the
component Pareto curves. In this example, VLIW processor designs are
considered having various numbers of data ports for communication with
the d-cache 1207. A graph 1701 of component Pareto curves for the
VLIW processor 1201 includes curves 1703, 1705 that represent
component Pareto curves for different numbers of d-ports. Similarly, a
graph 1711 of component Pareto curves for the systolic array 1203
includes component Pareto curves 1713, 1715 for different numbers of
systolic ports 1217.

Component Pareto curves are also prepared for the i-cache 1209,
d-cache 1207, and the u-cache 1211. A graph 1721 contains a
component Pareto curve 1723 for the i-cache 1209 and is prepared as
described above. A graph 1731 contains component Pareto curves
1733, 1735 for the d-cache 1207, the curves 1733, 1735
corresponding to different numbers of data ports 1215. While only two
curves 1733, 1735 are shown, additional numbers of data ports 1215
can be considered. The execution time of the d-cache 1207 is
independent of the number of data ports 1215, but cost is not.
Similarly, a graph 1741 contains component Pareto curves 1743, 1745
for the u-cache 1211 corresponding to different numbers of u-cache ports
1219. The component Pareto curves corresponding to the d-cache
1207, the i-cache 1209, and the u-cache 1211 are combined to produce
comprehensive Pareto curves 1751, 1753 corresponding to different
numbers of data ports 1215 and u-cache ports 1219. The combined
Pareto curves 1751, 1753 are component Pareto curves with respect to
the processor system 1200.

A combined Pareto curve 1761 is then prepared from the
component Pareto curves 1703, 1705 (for the VLIW processor 1201),
1713, 1715 (for the systolic array 1203), and 1751, 1753 (for the
cache memory 1205). In preparing the combined Pareto curves (or
sets), only designs having equal numbers of data ports 1215 for both the
VLIW processor 1201 and the d-cache 1207 are combined.
Combinations of component Pareto designs in which the numbers of
d-ports 1215, u-ports 1219, or other interconnection parameters are
unmatched are not used in preparing the combined Pareto curve 1761.
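The port-matching restriction can be sketched as a predicate applied while forming combinations. The field names, the function name ports_match, and all cost and time values are hypothetical; only the rule that unmatched port counts are excluded comes from the description.

```python
from itertools import product

# Hypothetical component Pareto designs, each tagged with its number of d-ports.
processor_designs = [
    {"d_ports": 1, "cost": 5.0, "time": 100},
    {"d_ports": 2, "cost": 7.0, "time": 80},
]
# The d-cache execution time is independent of the port count, but its cost is not.
cache_designs = [
    {"d_ports": 1, "cost": 3.0, "time": 40},
    {"d_ports": 2, "cost": 4.5, "time": 40},
]


def ports_match(proc, cache):
    """Only designs with equal numbers of data ports may be combined."""
    return proc["d_ports"] == cache["d_ports"]


system_designs = [
    {
        "d_ports": p["d_ports"],
        "cost": p["cost"] + c["cost"],
        "time": p["time"] + c["time"],
    }
    for p, c in product(processor_designs, cache_designs)
    if ports_match(p, c)
]
```

Of the four possible pairings in this sketch, the two with mismatched port counts are never formed into system designs.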
In the above design examples, the selected performance criteria
are execution time and cost. Additional performance criteria such as
dilation or power consumption can be considered in finding the component
Pareto sets. These additional performance criteria can be considered
along with execution time and cost, or other combinations of
performance criteria.
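A Pareto set over more than two performance criteria can be sketched by comparing metric tuples element by element. The function names and the (cost, execution time, power) values are hypothetical, and lower values are assumed to be better for every criterion.

```python
def eclipses(a, b):
    """True if a is no worse than b in every criterion and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_set(points):
    """Keep the metric tuples that no other tuple eclipses."""
    return [p for p in points if not any(eclipses(q, p) for q in points)]


# Hypothetical (cost, execution time, power) triples.
designs = [
    (4.0, 100, 2.0),
    (4.0, 100, 3.0),   # same cost and time as the first design, but more power
    (6.0, 80, 2.5),
    (3.0, 120, 1.5),
]
three_criteria = pareto_set(designs)
```

In this sketch the second design is eclipsed only because power is considered; with cost and execution time alone it would tie with the first design and remain in the set.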
Having illustrated and demonstrated the principles of the invention
in example embodiments, it should be apparent to those skilled in the art
that the embodiments can be modified in arrangement and detail without
departing from such principles. We claim as the invention all that comes
within the scope of the following claims.



What is claimed is: 






1. A method of programmatically selecting system designs from a
system design space, the method comprising:

specifying system designs as combinations of component designs
from respective component design spaces;

applying component quality filters to the component design spaces
to produce component quality sets of designs; and

forming a Cartesian product of the component quality sets to
obtain a set of system designs.

2. The method of claim 1, further comprising applying component
validity filters to respective component design spaces before applying the
component quality filters, wherein the component quality sets of designs
include only designs satisfying respective component validity filters.

3. The method of claim 1, further comprising applying a system
validity filter to the set of system designs to produce a validity filtered
set of system designs.

4. The method of claim 3, further comprising applying a system
quality filter to the set of system designs.

5. The method of claim 1, further comprising applying a system
quality filter to the set of system designs.

6. A method of programmatically selecting system designs that
are specified by combinations of component designs, the method
comprising:

preparing component validity sets for each of the component
designs by applying component validity filters to corresponding
component design spaces, the component validity filters defined by
corresponding component validity predicates; and

forming a set of system designs that is a Cartesian product of the
component validity sets.

7. The method of claim 6, wherein the component designs are
specified by component parameters, and the component validity filter for
each component is independent of the component parameters of other
components.

8. The method of claim 6, further comprising applying a system
validity filter to the Cartesian product of the component validity sets.

9. The method of claim 6, further comprising applying a system
quality filter to the Cartesian product of the component validity sets.

10. The method of claim 6, further comprising applying a system
evaluation function and a system quality filter to the Cartesian product of
the component validity sets after applying a system validity filter.

11. The method of claim 10, further comprising applying a
component evaluation function and a component quality filter to the
component validity sets.

12. The method of claim 6, further comprising applying a
component evaluation function and a component quality filter to at least
one of the component validity sets before forming the set of system
designs.
13. The method of claim 12, further comprising:

selecting a partial system design that includes component designs
for at least one component;

obtaining a lower bound for an evaluation metric for a system
design, wherein the system design includes the partial system design;
and

comparing an evaluation metric of a system that includes the
partial system design to the lower bound.

14. A method of selecting system designs that are specified by
combinations of component designs, the method comprising:

preparing component validity sets for each of the component
designs by applying component validity filters to corresponding
component designs, the component validity filters defined by
corresponding component validity predicates;

preparing component quality sets by applying corresponding
component evaluation functions and component quality filters to the
component validity sets; and

forming a set of system designs that is a Cartesian product of the
component quality sets.

15. The method of claim 14, further comprising applying a
system validity filter to the Cartesian product of the component quality
sets.

16. The method of claim 14, further comprising applying a
system evaluation function and a system quality filter to the Cartesian
product of the component quality sets.
17. The method of claim 16, wherein the component evaluation
functions and the system evaluation function produce component
evaluation metrics and system evaluation metrics, respectively, and the
system evaluation metrics are obtained from the component evaluation
metrics.

18. A computer readable medium comprising computer
executable instructions for performing the method of claim 1.

19. A computer readable medium comprising computer
executable instructions for performing the method of claim 6.

20. A computer readable medium comprising computer
executable instructions for performing the method of claim 14.

21. A method of programmatically selecting a system design
from a set of system designs, comprising:

defining a system validity predicate that is a function of two or
more terms;

defining partial validity predicates by expressing the system
validity predicate in a canonical form;

applying partial validity filters that are defined by the partial
validity predicates to the system designs to obtain partial validity sets;
and

combining the designs from the partial validity sets to obtain sets
of designs satisfying each of the two or more terms.

22. The method of claim 21, where each of the partial validity
predicates is in product form.
23. The method of claim 21, wherein the partial validity
predicates are mutually exclusive.

24. A method of programmatically selecting a set of system
designs, comprising:

selecting a system validity filter defined by a system validity
predicate, the system validity predicate including one or more partial
validity predicates that define partial validity filters;

applying the partial validity filters to the system designs;

forming partial validity sets that include system designs satisfying
respective partial validity filters;

applying an evaluation function to the system designs of the
partial validity sets, the evaluation function producing an evaluation
metric for each system design;

applying a quality filter to the system designs of the partial validity
sets, the quality filter comparing and selecting system designs based on
the evaluation metrics and producing respective partial quality sets; and

combining the partial quality sets to form a first quality set.

25. The method of claim 24, further comprising applying the
quality filter to the first quality set.

26. The method of claim 24, wherein each of the partial validity
predicates is in product form.

27. The method of claim 26, wherein the system validity
predicate is a product of the partial validity predicates.

28. The method of claim 26, wherein the partial validity sets are
combined to form two or more system validity sets.






29. A computer readable medium having computer executable
instructions for performing the method of claim 24.

30. A computer readable medium having software for performing
the method of claim 25.

31. A method of programmatically selecting a design for a cache
memory, comprising:

selecting components for the cache memory;

determining component Pareto sets for the components;

preparing a combined Pareto set from the component Pareto sets;
and

selecting a cache memory design from the combined Pareto set.

32. A method of selecting a design for a processor system, the
processor system including a processor and a cache memory, the
method comprising:

preparing a component Pareto set for the processor;

preparing a component Pareto set for a cache memory;

preparing a combined Pareto set from the component Pareto sets
of the processor and the cache memory; and

selecting a processor system design from the combined Pareto
set.

33. A method of programmatically generating a set of designs for
a processor system, comprising:

dividing the processor system into at least a processor component
and a memory component;

preparing component validity sets for the processor component
and the memory component;

forming a Cartesian product of the component validity sets to
produce a processor system validity set.

34. The method of claim 33, further comprising expressing the
system validity function in a logical canonical form.

35. A method of designing a processor system that includes a
processor component and a memory component, comprising:

determining component validity sets for the processor component
and the memory component;

dividing at least one of the component validity sets into subsets;
and

generating sets of system designs by combining component
designs from the component validity sets and the subsets.



36. A method of generating a set of partial validity predicates for
a system design that includes component designs for at least a first
component and a second component, the method comprising:

obtaining a system validity function defined by a system validity
predicate; and

identifying coupled terms in the system validity predicate, the
coupled terms including parameters of the components.

37. The method of claim 36, wherein the system design is a
processor system design and the components include a processor
component and a memory component.
38. The method of claim 37, further comprising expanding the
coupled terms to obtain singleton terms containing parameters of only
the processor component and singleton terms containing parameters of
only the memory component.

39. The method of claim 36, further comprising expanding the
coupled terms to obtain singleton terms containing parameters of only a
first component and singleton terms containing parameters of only a
second component.

40. The method of claim 39, further comprising expressing the
system validity predicate in canonical form.

41. The method of claim 36, further comprising expressing the
system validity predicate in canonical form.
PROGRAMMATIC DESIGN SPACE EXPLORATION THROUGH VALIDITY
FILTERING AND QUALITY FILTERING

Abstract 

Design spaces for systems, including hierarchical systems, are 
programmatically validity filtered and quality filtered to produce validity 
sets and quality sets, reducing the number of designs to be evaluated in 
selecting a system design for a particular application. Validity filters and 
quality filters are applied to both system designs and component 
designs. Component validity sets are combined as Cartesian products to 
form system validity sets that can be further validity filtered. Validity 
filters are defined by validity predicates that are functions of discrete 
system parameters and that evaluate as TRUE for potentially valid 
systems. For some hierarchical systems, the system validity predicate is 
a product of component validity predicates. Quality filters use an 
evaluation metric produced by an evaluation function that permits 
comparing designs and preparing a quality set of selected designs. In 
some cases, the quality set is a Pareto set or an approximation thereof. 
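
The sequence summarized above (component validity filtering, component quality filtering, Cartesian product, system quality filtering) can be sketched as follows. This is an illustrative sketch under stated assumptions: the component parameters, the validity predicates, the additive system cost and time, and all function names and numeric values are hypothetical.

```python
from itertools import product


def validity_filter(designs, predicate):
    """Keep designs for which the validity predicate evaluates as TRUE."""
    return [d for d in designs if predicate(d)]


def quality_filter(designs, evaluate):
    """Pareto filter on the metric tuples returned by evaluate (lower is better)."""
    scored = [(d, evaluate(d)) for d in designs]

    def eclipsed(m):
        return any(n != m and all(a <= b for a, b in zip(n, m)) for _, n in scored)

    return [d for d, m in scored if not eclipsed(m)]


def metrics(d):
    return (d["cost"], d["time"])


# Hypothetical component design spaces.
processor_space = [
    {"units": 1, "cost": 2.0, "time": 90},
    {"units": 2, "cost": 3.5, "time": 60},
    {"units": 0, "cost": 0.5, "time": 0},    # invalid: no function units
    {"units": 3, "cost": 6.0, "time": 65},   # valid, but eclipsed by units=2
]
memory_space = [
    {"size_kb": 2, "cost": 1.0, "time": 30},
    {"size_kb": 3, "cost": 1.2, "time": 28},  # invalid: size is not a power of 2
    {"size_kb": 4, "cost": 2.0, "time": 20},
]

proc_valid = validity_filter(processor_space, lambda d: d["units"] >= 1)
mem_valid = validity_filter(
    memory_space, lambda d: (d["size_kb"] & (d["size_kb"] - 1)) == 0
)
proc_quality = quality_filter(proc_valid, metrics)
mem_quality = quality_filter(mem_valid, metrics)

# Cartesian product of the component quality sets (cost and time assumed additive).
system_set = [
    {"proc": p, "mem": m, "cost": p["cost"] + m["cost"], "time": p["time"] + m["time"]}
    for p, m in product(proc_quality, mem_quality)
]
system_quality = quality_filter(system_set, metrics)
```

In this sketch twelve raw combinations are reduced to four system designs before any system-level evaluation is performed.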

U. S. Priority Claim 

I hereby claim the benefit under Title 35, United States Code, Section 120 of any United States application(s) listed below and,
insofar as the subject matter of each of the claims of this application is not disclosed in the prior United States application in the
manner provided by the first paragraph of Title 35, United States Code Section 112, I acknowledge the duty to disclose material
information as defined in Title 37, Code of Federal Regulations, Section 1.56(a) which occurred between the filing date of the prior
application and the national or PCT international filing date of this application:



APPLICATION SERIAL NUMBER 



FILING DATE 



STATUS (patented/pending/abandoned)



POWER OF ATTORNEY: 

As a named inventor, I hereby appoint the following attorney(s) and/or agent(s) to prosecute this application and transact all 
business in the Patent and Trademark Office connected therewith: 



Customer Number: 022879



Send Correspondence to: 
HEWLETT-PACKARD COMPANY 
Intellectual Property Administration 
P.O. Box 272400 

Fort Collins, Colorado 80528-9599 



Direct Telephone Calls To: 

Michael D. Jones 
(503) 226-7391 



I hereby declare that all statements made herein of my own knowledge are true and that all statements 
made on information and belief are believed to be true; and further that these statements were made 
with the knowledge that willful false statements and the like so made are punishable by fine or 
imprisonment, or both, under Section 1001 of Title 18 of the United States Code and that such willful 
false statements may jeopardize the validity of the application or any patent issued thereon. 



Full Name of inventor: Bantwal Ramakrishna Rau 

Residence: Los Altos, CA 94024 

Post Office Address: 900 Highlands Circle, Los Altos, CA 94024 



Citizenship: U.S.A. 



Inventor's Signature 

Rev 11/99 (DecPwr)



Date 

(Use Page Two For Additional Inventor(s) Signature(s))

DECLARATION AND POWER OF ATTORNEY
FOR PATENT APPLICATION (continued)



ATTORNEY DOCKET NO. HP1 0990408-1 



Full Name of # 2 joint inventor: Santosh G. Abraham

Residence: Pleasanton, CA

Post Office Address: 4776 Amanda Place, Pleasanton, CA 94566

Citizenship: Indian

Inventor's Signature

Date



Full Name of # 3 joint inventor: Robert Schreiber

Residence: Palo Alto, CA

Post Office Address: 183 Creekside Drive, Palo Alto, CA 94306

Citizenship: U.S.A.

Inventor's Signature

Date


